ABSTRACT: Misunderstanding about the term "random samples" and its implications may
easily arise. Conditions under which the phases, obtained from arrival times, do not form a
random sample, and the dangers involved, are discussed. Watson's $U^2$ test for uniformity is
recommended for light curves with duty cycles larger than 10%. Under certain conditions,
non-parametric density estimation may be used to determine estimates of the true light curve
and its parameters.


1. INTRODUCTION: Consider a series of arrival times $t_i$, $i = 1, \ldots, N$, of $\gamma$-rays
from a certain source direction. The case is studied where the data contain a periodic component
of strength $p$ (pulsed counts/total counts) and period $T$. In the case of detectors with low
count rates, the obvious requirement is to determine the significance of $p$ as being due to a
periodic source against the possibility that it is only a statistical fluctuation from the uniform
background. The deduction of a possible light curve is also important. In this paper the following
points are covered: (1) the problem of "random samples", (2) tests for uniformity, (3)
non-parametric density estimators of the true periodic light curve and (4) the determination of
the light curve parameters from the non-parametric density estimator.

2. THE PROBLEM OF "RANDOM SAMPLES": The measured data are the arrival times with
the property $t_i > t_{i-1}$ ($i = 2, \ldots, N$). Assume this process, apart from the periodic
component in the data, to be time independent. It is desirable to estimate the true light curve
from the arrival times. This is done by folding the $t_i$'s modulo $2\pi$, with respect to a known
period $T$. This results in the "sample" $(\theta_1, \ldots, \theta_N)$, with $\theta_i$ the
so-called phases, which are calculated as

$$\theta_i = \frac{2\pi t_i}{T} \pmod{2\pi} = 2\pi\left(\frac{t_i}{T} - k\right), \quad i = 1, \ldots, N, \quad k \in \mathbb{N} \tag{1}$$

The choice of $2\pi$ is to allow the application of trigonometric functions on the phases. This
sample has mostly been treated as being random. It would be random if and only if
(a) all the $\theta_i$'s are identically distributed and (b) they are statistically independent.
If the phases do not form a random sample, then no conclusions about the "true underlying light
curve" can be made. The fact is that the phases do not form a random sample! This can be seen
as follows:


From eq. (1) the probability density functions (p.d.f.) of $\theta_i$ and $t_i$ are related by the
following wrapping process (Mardia, 1972):

$$f_{\theta_i}(\theta) = \frac{T}{2\pi} \sum_{k=0}^{\infty} f_{t_i}\!\left(\frac{T\theta}{2\pi} + Tk\right) \tag{2}$$

Since $t_i > t_{i-1}$, it follows that $f_{\theta_i}(\theta) \neq f_{\theta_j}(\theta)$ for every
$\theta$ and all $i \neq j$, thus proving that the $\theta_i$'s are not identically distributed.
Furthermore $t_i = t_{i-1} + (t_i - t_{i-1})$, which implies that $t_i$ is a function of
$t_{i-1}$. Since $\theta_i$ is a function of $t_i$, it follows that $\theta_i$ is also a function
of $\theta_{i-1}$. This shows that the phases are not independently distributed. It should however
be noted that if the time differences $v_i = t_i - t_{i-1}$ are used, a random sample would result
by folding the $v_i$'s.
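As a minimal sketch of the folding in eq. (1), and of the alternative of folding the time differences $v_i$, the following Python fragment may help. The function names and the example arrival times are hypothetical and are not taken from the paper.

```python
import math

def fold_phases(times, period):
    """Fold times modulo the period and scale to [0, 2*pi), as in eq. (1)."""
    return [2.0 * math.pi * ((t / period) % 1.0) for t in times]

def fold_differences(times, period):
    """Fold the time differences v_i = t_i - t_{i-1} instead of the times."""
    diffs = [later - earlier for earlier, later in zip(times, times[1:])]
    return fold_phases(diffs, period)

# Hypothetical, strictly increasing arrival times (arbitrary units).
arrival_times = [0.25, 1.5, 2.75, 4.0]
phases = fold_phases(arrival_times, period=1.0)
diff_phases = fold_differences(arrival_times, period=1.0)
```

Note that folding N arrival times yields N phases, whereas folding the differences yields only N - 1 values.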

From simulations of arrival times the following seems evident (let
$b = E(t_i - t_{i-1}) \approx 1/\text{count rate}$): The distributions become approximately
identical when $T < b$. If $T \approx b$, then it suffices to add a constant large time to each
$t_i$, so that $t_1 \gg 0$. This will ensure almost identically distributed random phases. If the
period $T$ equals the whole period of observation ($T \gg b$), then

$$\theta_i = \frac{2\pi t_i}{T} \quad \text{and} \quad f_{\theta_i}(\theta) = \frac{T}{2\pi} f_{t_i}\!\left(\frac{T\theta}{2\pi}\right) \tag{3}$$

so that the phases are not identically distributed. The "runs-test" (Lindgren, 1976) was used
to determine whether the phases are independently distributed: For $T < b$, the phases seem to
be independent random variables and for $T \gg b$ there was strong evidence for dependency,
which is also clear from eq. (3). Independence can be accepted, with a 10% uncertainty, for
$T < 3b$. This result seems to be independent of the pulsed fraction.
Thus, for $T > 3b$, the true light curve cannot be estimated. In $\gamma$-ray astronomy this
problem arises for astrophysical objects with periods that are large in comparison with $b$.
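The dependence of the phases on the ratio $T/b$ can be explored with a small simulation. The sketch below assumes Poisson arrivals with mean spacing $b$ and uses the Wald-Wolfowitz runs test about the median; which variant of the runs test was used in the paper is not stated, so this choice, the function names and the parameter values are assumptions.

```python
import math
import random

def runs_test_z(values):
    """Runs test about the median; returns the normal-approximation z-score."""
    med = sorted(values)[len(values) // 2]
    signs = [v > med for v in values if v != med]   # drop ties with the median
    n1 = sum(signs)
    n2 = len(signs) - n1
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    mean = 2.0 * n1 * n2 / (n1 + n2) + 1.0
    var = (mean - 1.0) * (mean - 2.0) / (n1 + n2 - 1.0)
    return (runs - mean) / math.sqrt(var)

def simulate_phases(n, b, period, rng):
    """Phases of n Poisson arrivals with mean spacing b, folded with the period."""
    t, phases = 0.0, []
    for _ in range(n):
        t += rng.expovariate(1.0 / b)
        phases.append(2.0 * math.pi * ((t / period) % 1.0))
    return phases

rng = random.Random(0)
z_short = runs_test_z(simulate_phases(500, b=1.0, period=0.5, rng=rng))     # T < b
z_long = runs_test_z(simulate_phases(500, b=1.0, period=5000.0, rng=rng))   # T >> b
```

For $T \gg b$ the phases increase monotonically during the observation, so only two runs occur and the z-score is strongly negative, in line with eq. (3).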

3. TESTS FOR UNIFORMITY: Let $\theta$ be a random variable with p.d.f. $f(\theta)$, which is
assumed to be unknown. An appropriate test in this case would be some non-parametric test:

$$H_0: f(\theta) = U(\theta) = \frac{1}{2\pi} \quad \text{against} \quad H_1: f(\theta) \neq U(\theta) \tag{4}$$

The alternative hypothesis $H_1$ only suggests that the unknown p.d.f. differs from uniformity.
In order to compare tests, the following general form of $f(\theta)$, which covers most cases in
$\gamma$-ray astronomy, was assumed:

$$f(\theta) = p_1 S(\theta; \mu_1, \delta_1) + p_2 S(\theta; \mu_2, \delta_2) + \frac{1 - p_1 - p_2}{2\pi} \tag{5}$$

The pulsed fraction and phase (mean position) of each peak are denoted for $i = 1, 2$ by $p_i$ and
$\mu_i$ respectively, while $\delta_i$ refers to the FWHM of each peak, divided by the period $T$.
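For simulation studies, phases can be drawn from a light curve of the form of eq. (5). The sketch below assumes a wrapped normal shape for $S$ (the source function is left general in eq. (5)) and converts the FWHM $\delta_i$, given in units of the period, to a standard deviation in radians. All names and parameter values are hypothetical.

```python
import math
import random

TWO_PI = 2.0 * math.pi
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))   # about 2.355 for a normal

def sample_light_curve(n, peaks, rng):
    """Draw n phases from eq. (5); peaks is a list of (p_i, mu_i, delta_i)."""
    phases = []
    for _ in range(n):
        u, acc = rng.random(), 0.0
        for p, mu, delta in peaks:
            acc += p
            if u < acc:                                  # event belongs to this peak
                sigma = TWO_PI * delta / FWHM_PER_SIGMA
                phases.append(rng.gauss(mu, sigma) % TWO_PI)
                break
        else:                                            # uniform background event
            phases.append(rng.uniform(0.0, TWO_PI))
    return phases

rng = random.Random(1)
sample = sample_light_curve(1000, [(0.3, 1.0, 0.1)], rng)   # one peak, p_1 = 0.3
```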

The two most commonly used tests for uniformity in $\gamma$-ray astronomy are:

1) $\chi^2$-test: The advantage of this test is that it is a non-parametric test, but its drawback
is the choice of the number of bins $K$ (= degrees of freedom + 1) and their positions on the
phasogram. The best choice for $K$ is $1/\hat{\delta}$, where $\hat{\delta}$ is some estimate of
$\delta$. From simulations it was evident that the sensitivity of this test increases with
decreasing duty cycles.
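A minimal sketch of the binned test follows, assuming equal-width bins starting at phase zero (the bin positions are a free choice, which is exactly the drawback noted above). The helper for $K \approx 1/\hat{\delta}$ and its lower limit of two bins are illustrative assumptions.

```python
import math

def chi2_uniformity(phases, n_bins):
    """Pearson chi-square statistic of the phasogram against a flat light curve."""
    counts = [0] * n_bins
    for theta in phases:
        k = int(theta / (2.0 * math.pi) * n_bins)
        counts[min(k, n_bins - 1)] += 1          # guard against theta == 2*pi
    expected = len(phases) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)

def bins_from_duty_cycle(delta_hat):
    """Number of bins K of about 1/delta_hat, with at least two bins."""
    return max(2, round(1.0 / delta_hat))

flat = [2.0 * math.pi * (i + 0.5) / 100 for i in range(100)]   # evenly spread phases
peaked = [0.1] * 100                                           # all events in one bin
stat_flat = chi2_uniformity(flat, bins_from_duty_cycle(0.1))
stat_peaked = chi2_uniformity(peaked, bins_from_duty_cycle(0.1))
```

The statistic would then be compared with the critical value of the $\chi^2$ distribution with $K - 1$ degrees of freedom.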

2) Rayleigh test (Mardia, 1972): The motivation for the use of this test is its independence of
bins. It is however a parametric test that was derived for von Mises alternatives. This
corresponds to $p_1 = 1$, $p_2 = 0$ and $S(\theta; \mu_1, \delta_1)$ the von Mises distribution
$M(\theta; \mu, \kappa)$. In the case of bimodal data (as with certain pulsars), the phase
difference is $\approx 0.42$ and the value of the test statistic $\bar{R}$ is small when
$p_1 \approx p_2$. This is the result of two nearly opposing vectors cancelling each other when
the test statistic

$$\bar{R} = \sqrt{\bar{C}^2 + \bar{S}^2} \quad \text{with} \quad \bar{C} = \frac{1}{N}\sum_{i=1}^{N} \cos\theta_i, \quad \bar{S} = \frac{1}{N}\sum_{i=1}^{N} \sin\theta_i \tag{6}$$

is computed. Consequently bimodal data may be interpreted by this test as being uniform and
real sources could then be discarded.
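The cancellation can be illustrated numerically. The sketch below evaluates eq. (6) for an idealised unimodal sample and for two equally strong, infinitely narrow peaks separated by 0.42 of a period; these idealised samples are assumptions chosen only to make the effect visible.

```python
import math

def rayleigh_rbar(phases):
    """Mean resultant length R-bar of eq. (6)."""
    n = len(phases)
    c_bar = sum(math.cos(t) for t in phases) / n
    s_bar = sum(math.sin(t) for t in phases) / n
    return math.hypot(c_bar, s_bar)

separation = 0.42 * 2.0 * math.pi            # phase difference of the two peaks
unimodal = [1.0] * 100                       # all events in a single narrow peak
bimodal = [0.0] * 50 + [separation] * 50     # two equal narrow peaks (p_1 = p_2)

r_uni = rayleigh_rbar(unimodal)              # close to 1
r_bi = rayleigh_rbar(bimodal)                # equals |cos(0.42 * pi)|, about 0.25
```

For large samples, $2N\bar{R}^2$ is approximately $\chi^2$ distributed with two degrees of freedom under $H_0$ (Mardia, 1972), so a small $\bar{R}$ translates directly into a low significance.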


Two somewhat neglected non-parametric tests in this area of research are Kuiper's $V_N$ test and
Watson's $U^2$ test. Their distributions under $H_0$, with the corresponding critical values, are
discussed by Mardia (1972). A brief outline of each test's algorithm is as follows:


3) Kuiper's $V_N$ test: Let $\theta_{(1)}, \ldots, \theta_{(N)}$ be the ordered phases. With
$U_i = \theta_{(i)}/2\pi$, the test statistic is computed by

$$V_N = \max_i\left(U_i - \frac{i}{N}\right) - \min_i\left(U_i - \frac{i}{N}\right) + \frac{1}{N} \tag{7}$$

so that only the minimum and maximum deviations from the uniform distribution are taken into
account. It can intuitively be seen that this test will be sensitive to light curves with narrow
duty cycles, but insensitive to those with broad duty cycles.
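Eq. (7) translates almost directly into code. The following is a sketch only; the function name and the two small test samples are hypothetical.

```python
import math

def kuiper_vn(phases):
    """Kuiper's V_N of eq. (7), computed from phases in [0, 2*pi)."""
    n = len(phases)
    u = sorted(theta / (2.0 * math.pi) for theta in phases)
    deviations = [u_i - i / n for i, u_i in enumerate(u, start=1)]
    return max(deviations) - min(deviations) + 1.0 / n

evenly_spread = [2.0 * math.pi * (i - 0.5) / 4 for i in range(1, 5)]
concentrated = [math.pi] * 4

v_even = kuiper_vn(evenly_spread)    # smallest possible value, 1/N
v_conc = kuiper_vn(concentrated)     # all events at one phase gives 1
```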



Figure 1 Power curves of the four different tests for uniformity. The light curve is assumed
to be unimodal. The subscripts for the $\chi^2$-test refer to the degrees of freedom.

4) Watson's $U^2$ test: With $U_i$ as above, the statistic is computed as follows:

$$U^2 = \sum_{i=1}^{N}\left[U_i - \bar{U} - \frac{2i-1}{2N} + \frac{1}{2}\right]^2 + \frac{1}{12N} \tag{8}$$

This is a type of mean square error with respect to the uniform distribution, so that the
information of each phase is taken directly into account in the calculation of $U^2$.
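A corresponding sketch for Watson's statistic is given below, with $\bar{U}$ taken as the mean of the $U_i$ (as in Mardia, 1972). The function name and test samples are hypothetical.

```python
import math

def watson_u2(phases):
    """Watson's U^2 of eq. (8), computed from phases in [0, 2*pi)."""
    n = len(phases)
    u = sorted(theta / (2.0 * math.pi) for theta in phases)
    u_bar = sum(u) / n
    total = sum((u_i - u_bar - (2 * i - 1) / (2.0 * n) + 0.5) ** 2
                for i, u_i in enumerate(u, start=1))
    return total + 1.0 / (12.0 * n)

evenly_spread = [2.0 * math.pi * (i - 0.5) / 4 for i in range(1, 5)]
concentrated = [math.pi] * 4

u2_even = watson_u2(evenly_spread)   # minimum value 1/(12 N)
u2_conc = watson_u2(concentrated)    # 1/3 for N = 4
```

Unlike eq. (7), every ordered phase contributes to the sum, which is why the statistic responds to broad as well as narrow peaks.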
The procedure to determine which of these four tests is best would be to find the test
with the largest power. Since these tests are non-parametric (except the Rayleigh test), one
cannot expect to find a single test with the largest power for all choices of parameters in eq. (5).
An indication of the relative performance of these tests is given in Figure 1, which was
obtained through simulations of unimodal data. It can be seen that Watson's test is the best test
for duty cycles larger than 10% and the $\chi^2$-test is best for duty cycles less than 10%. In the
latter case it can be seen that the power of the $\chi^2$-test increases if the number of bins is
increased. A good choice is $K \approx 1/\delta$. At small duty cycles it can be seen that the
Rayleigh test performs badly relative to any other test. These conclusions remain independent of
the pulsed fraction $p_1$.

The question obviously arises whether one may use these tests for uniformity when the phases
are not random. The answer is yes, but only for those tests where the distribution of the test
statistic is insensitive (robust) with respect to deviations from randomness. This has been
investigated for the four tests discussed by looking for a change in the critical values as $T$
increases with respect to $b$. Fortunately these values did not change, so that these tests may be
used for any relation between $T$ and $b$.


4. NON-PARAMETRIC DENSITY ESTIMATION OF LIGHT CURVES: Although a test for
uniformity is a first step in identifying a source, the additional estimation of a light curve is very
important. The usual method to display a light curve in $\gamma$-ray astronomy is to bin the data
into a histogram. The disadvantage of this method is that it is dependent on bin positions and their
sizes. A more correct way to display an estimate of the true unknown p.d.f. is through the use
of a non-parametric density estimator. This method assumes that the data are random. Since the
light curve is a periodic one, a good estimator would be a truncated Fourier series. This estimate
and its standard error can easily be computed. The application to estimation on a circle is as
follows: Let the random sample be $D = (\theta_1, \ldots, \theta_N)$ with unknown p.d.f.
$f(\theta)$. The characteristic function (c.f.) of $f(\theta)$ and its corresponding estimator are

$$\phi_p = \int_0^{2\pi} e^{ip\theta} f(\theta)\,d\theta = \alpha_p + i\beta_p \quad \text{and} \quad \hat{\phi}_p = \hat{\alpha}_p + i\hat{\beta}_p = \frac{1}{N}\sum_{i=1}^{N}\cos p\theta_i + i\,\frac{1}{N}\sum_{i=1}^{N}\sin p\theta_i \tag{8}$$

Using the inversion formula (Mardia, 1972) we obtain

$$f(\theta) = \frac{1}{2\pi}\left(1 + 2\sum_{p=1}^{\infty}\left(\alpha_p \cos p\theta + \beta_p \sin p\theta\right)\right)$$

The following asymptotically unbiased estimator of $f(\theta)$ is proposed:

$$\hat{f}(\theta; D, m) = \frac{1}{2\pi}\left(1 + 2\sum_{p=1}^{m}\left(\hat{\alpha}_p \cos p\theta + \hat{\beta}_p \sin p\theta\right)\right) \tag{9}$$

where $m$ is some "smoothing parameter". Using the method of cross-validation (Bowman,
1984), $m$ can be estimated by $\hat{m}$, where $\hat{m}$ is that value of $m$ which minimizes

$$\sum_{i=1}^{N}\left[\int_0^{2\pi} \hat{f}_{N-1}^{\,2}(\theta; D_i, m)\,d\theta - 2\hat{f}_{N-1}(\theta_i; D_i, m)\right] \quad \text{with} \quad D_i = (\theta_1, \ldots, \theta_{i-1}, \theta_{i+1}, \ldots, \theta_N) \tag{10}$$
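The estimator and a leave-one-out cross-validation score can be sketched as follows. The score implemented here is the usual least-squares criterion, the sum over $i$ of the integral of the squared leave-one-out estimate minus twice that estimate evaluated at $\theta_i$; it is assumed, not confirmed, that this matches the paper's criterion exactly. The integral is evaluated with Parseval's relation, and the function names and the search limit `m_max` are hypothetical.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def fourier_coefficients(phases, m):
    """Empirical trigonometric moments (alpha_p, beta_p) for p = 1..m."""
    n = len(phases)
    a = [sum(math.cos(p * t) for t in phases) / n for p in range(1, m + 1)]
    b = [sum(math.sin(p * t) for t in phases) / n for p in range(1, m + 1)]
    return a, b

def density_estimate(theta, a, b):
    """Truncated Fourier series estimate of f(theta) with m = len(a) harmonics."""
    s = sum(a_p * math.cos(p * theta) + b_p * math.sin(p * theta)
            for p, (a_p, b_p) in enumerate(zip(a, b), start=1))
    return (1.0 + 2.0 * s) / TWO_PI

def cv_score(phases, m):
    """Leave-one-out least-squares cross-validation score (assumed criterion)."""
    n = len(phases)
    a, b = fourier_coefficients(phases, m)
    score = 0.0
    for t in phases:
        integral, value = 1.0, 1.0
        for p in range(1, m + 1):
            c, s = math.cos(p * t), math.sin(p * t)
            a_loo = (n * a[p - 1] - c) / (n - 1)    # coefficient without theta_i
            b_loo = (n * b[p - 1] - s) / (n - 1)
            integral += 2.0 * (a_loo ** 2 + b_loo ** 2)   # Parseval
            value += 2.0 * (a_loo * c + b_loo * s)
        score += (integral - 2.0 * value) / TWO_PI
    return score

def select_m(phases, m_max):
    """Smoothing parameter in 1..m_max that minimises the cross-validation score."""
    return min(range(1, m_max + 1), key=lambda m: cv_score(phases, m))

rng = random.Random(2)
sample = [rng.gauss(3.0, 0.5) % TWO_PI for _ in range(200)]   # hypothetical data
m_hat = select_m(sample, m_max=10)
a_hat, b_hat = fourier_coefficients(sample, m_hat)
```

The estimate integrates to one for any $m$ because every harmonic integrates to zero over a full period.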


The approximate confidence band of $f(\theta)$ is

$$\hat{f}(\theta; D, \hat{m}) \pm s\sqrt{\mathrm{Var}(\hat{f})} \tag{11}$$

with $s = 1.96$ being the quantity determining the 95% confidence limit. The probability that the
true p.d.f. will be within the band will be approximately 95%. Figure 2 displays an example of
these bands. One can thus use these bands, in their own fashion, to determine the significance
of periodic emission. For $\delta \ll 1$, one may encounter the problem of oversmoothing.
Tabulated values of $\hat{m}$ for such cases will be presented by the authors.
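The paper does not spell out how the variance in eq. (11) is computed. One possible sketch, offered purely as an assumption, uses the fact that the truncated Fourier estimate is an average of kernel terms $K_m(\theta - \theta_i)$ with $K_m(u) = (1 + 2\sum_{p=1}^{m}\cos pu)/2\pi$, so that its variance can be estimated by the sample variance of those terms divided by $N$.

```python
import math

def kernel(u, m):
    """Kernel K_m(u) whose sample average equals the truncated Fourier estimate."""
    return (1.0 + 2.0 * sum(math.cos(p * u) for p in range(1, m + 1))) / (2.0 * math.pi)

def estimate_with_band(theta, phases, m, s=1.96):
    """Return (lower, estimate, upper) at phase theta for an approximate band."""
    n = len(phases)
    terms = [kernel(theta - t, m) for t in phases]
    f_hat = sum(terms) / n
    # Assumed variance estimate: sample variance of the kernel terms over N.
    var = sum((k - f_hat) ** 2 for k in terms) / (n - 1) / n
    half_width = s * math.sqrt(var)
    return f_hat - half_width, f_hat, f_hat + half_width

lower, centre, upper = estimate_with_band(0.0, [0.2, 0.4, 3.0, 5.5, 6.0], m=2)
```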



Figure 2 The density estimator $\hat{f}(\theta; D, \hat{m})$ of $f(\theta)$. The 95% confidence
band and the ON- and OFF-source regions are also indicated.

5. LIGHT CURVE PARAMETERS FROM THE DENSITY ESTIMATOR: Even if one does
not have any knowledge of the true p.d.f. $f(\theta)$, it is still desirable to know the light curve
parameters. Since the estimator is asymptotically unbiased, one may estimate the desired
parameters as follows: Determine the pulsed region $[\theta_1, \theta_2]$ roughly. Then determine
the uniform background level $c$:

$$c = \left[\int_0^{\theta_1} \hat{f}(\theta)\,d\theta + \int_{\theta_2}^{2\pi} \hat{f}(\theta)\,d\theta\right] \Big/ \left(2\pi - \theta_2 + \theta_1\right) \tag{13}$$

Using this line of height $c$, determine a better estimate of the pulsed region. This may lead to a
small improvement of $c$. Obtain the light curve parameters:

$$\hat{p} = \int_{\theta_1}^{\theta_2} \left(\hat{f}(\theta) - c\right) d\theta \quad \text{and} \quad \hat{\mu} = \int_{\theta_1}^{\theta_2} \theta\left(\hat{f}(\theta) - c\right) d\theta \Big/ \hat{p} \tag{14}$$

The duty cycle (FWHM) can be obtained graphically or numerically from the peak of the light
curve, or from the variance of the pulse. The latter can only be done when a specific source
function $S(\theta; \mu, \delta)$ is assumed:

$$\hat{\sigma}^2(\delta) = \int_{\theta_1}^{\theta_2} \theta^2\left(\hat{f}(\theta) - c\right) d\theta \Big/ \hat{p} - \hat{\mu}^2$$

From these parameters one can obtain the significance of periodic emission in terms of the usual
number of standard deviations NSIG from the uniform background. Using a normal distribution
for $S(\theta; \mu, \delta)$ and the interval $\mu \pm 1.96\sigma$ (95% area under the normal
curve for this interval) for the pulsed region, NSIG was computed for unimodal light curves
with a 10% periodic signal. The results are presented in Figure 3. The latter can be used to
determine the total number of events that is required to obtain a certain level of significance.
From Figure 3 it can be seen that the smaller the duty cycle, the easier it is to identify a source.
This method can also be applied to bimodal light curves.



Figure 3 Contours of significance of periodic emission as a function of the pulse duty cycle
and sample size $N$.

6. CONCLUSIONS: When the phases are formed from the arrival times, great care should be
taken if the periodic light curve and the corresponding parameters are to be estimated from the
sample. In the first place, analysis should be restricted to time independent processes (i.e. the
form of the light curve should not change during the observation time). The next step would
be to perform a test for the independence of the sample. The null hypothesis of independence
will usually be accepted for $T < 3b$. This condition will usually also ensure that the sample
variables (phases) are identically distributed if one lets $t_1 \gg 0$. Under these conditions the
sample will be random and the p.d.f. with its corresponding parameters can be estimated. Certain
tests for uniformity, like those discussed in section 3, may be used whether the sample is random
or not. Watson's test seems to be the best test of those discussed for unimodal light curves with
duty cycles larger than 10%, while the $\chi^2$-test performs better at smaller duty cycles. The
best choice for the number of bins in the $\chi^2$-test is approximately $1/\delta$. The Rayleigh
test is not a very dependable test since it is a parametric test that was derived for a very limited
form of the light curve.

Likelihood ratio tests for uniformity are presently being investigated by the authors. This will 
result in the best test for light curves of the form of eq. (5). Such an analysis would 
automatically present the light curve parameters with their corresponding standard errors. 